Abstract

It is well known that, when normalized by n, the expected length of a longest common subsequence of d sequences
of length n over an alphabet of size $\sigma$ converges to a constant $\gamma_{\sigma,d}$. We disprove a speculation by
Steele regarding a possible relation between $\gamma_{2,d}$ and $\gamma_{2,2}$. In order to do that we also obtain some
new lower bounds for $\gamma_{\sigma,d}$, when both $\sigma$ and d are small integers.

1 Introduction 

String matching is one of the most intensively analyzed problems in computer science. Among string matching 
problems the longest common subsequence problem (LCS) stands out. This problem consists of finding the longest 
subsequence common to all strings in a set of sequences (often just two). The LCS problem is the basis of Unix's 
diff command, has applications in bioinformatics, and also arises naturally in remarkably distinct domains such as
cryptographic snooping, the mathematical analysis of bird songs, and comparative genomics. In addition, the LCS 
problem offers a concrete basis for the illustration and benchmarking of mathematical methods and tools such as 
subadditive methods and martingale inequalities; see for example Steele's monograph [Ste86]. 

Although the LCS problem has been studied in many different contexts, several issues concerning it are still
unresolved. The most prominent of the outstanding questions relating to the LCS problem concerns the length
$L_{n,\sigma,d}$ of an LCS of d sequences of n characters chosen uniformly and independently over some alphabet of
size $\sigma$. Subadditivity arguments yield that for fixed d and n going to infinity, the expected value of
$L_{n,\sigma,d}$ normalized by n converges to a constant $\gamma_{\sigma,d}$. For $\sigma, d \ge 2$, the precise value of
$\gamma_{\sigma,d}$ is unknown. The constant $\gamma_{2,2}$ is referred to as the Chvatal-Sankoff constant. The
calculation of its exact value has been an open problem for over three decades. The determination of its value has
received a fair amount of attention, starting with the work of Chvatal and Sankoff [CS75], encompassing among others
[Dek79, Ale94, DP95, Dan98, BYNGS99, Lue03], and is explicitly stated as an open problem in several well known texts
such as the ones by Waterman [Wat95, § 11.1.3], Steele [Ste96, p. 3], Pevzner [Pev00, p. 107], and Szpankowski
[Szp00, p. 109]. To the best of our knowledge the current sharpest bounds on $\gamma_{2,2}$ are due to Lueker [Lue03]
who established that $0.788071 \le \gamma_{2,2} \le 0.826280$.

The starting point for this investigation is the following comment by Steele [Ste86]:

"It would be of interest to relate $c_3$ to $c$, and one is tempted to speculate that $c_3 = c^2$ (and more generally
that $c_k = c^{k-1}$). Computational evidence does not yet rule this out."

Here, Steele uses $c$ to denote the limiting value of the length of the longest common subsequence of two random
sequences of length n normalized by n as n goes to infinity, and in general, he uses $c_k$ to denote the analogous
constant for k sequences. However, it is unclear if in this comment he uses $c$ and $c_k$ to denote the constants
$\gamma_{2,2}$ and $\gamma_{2,k}$ (i.e. specifically for the case of alphabet size 2) or if he is generically denoting
the constants for arbitrary alphabet size. Dancik [Dan98] cites the previous statement as a conjecture by Steele using
the second interpretation, i.e., as the claim that for all $d \ge 3$ and $\sigma \ge 2$,

$$ \gamma_{\sigma,d} = \gamma_{\sigma,2}^{d-1}. \qquad (1) $$

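Dancik [Dan98, Theorem 2.1, Corollary 2.1] shows that for $d \ge 2$,

$$ 1 \le \liminf_{\sigma\to\infty} \sigma^{1-1/d}\,\gamma_{\sigma,d} \le \limsup_{\sigma\to\infty} \sigma^{1-1/d}\,\gamma_{\sigma,d} \le e. $$

Hence, if (1) were true, then for $\varepsilon > 0$ and $\sigma$ sufficiently large,

$$ 1 - \varepsilon \le \sigma^{1-1/d}\,\gamma_{\sigma,d} = \sigma^{1-1/d}\,\gamma_{\sigma,2}^{d-1} \le \sigma^{1-1/d}\left(\frac{e+\varepsilon}{\sigma^{1/2}}\right)^{d-1}. $$

Dancik's results disprove (1) by observing that for $d > 2$ one may choose $\sigma$ large enough so as to make the
rightmost term of the last displayed equation arbitrarily close to 0.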

If we use the first interpretation of Steele's speculation quoted above, i.e., considering only the case of binary 
alphabets as we believe it was intended, then (1) is not invalidated by Dancik's work. 

In [Ste86], Steele does not justify his speculation. The following non-rigorous argument gives some indication
that one should expect that $\gamma_{2,3}$ is strictly bigger than $\gamma_{2,2}^2$. Indeed, let $A_1$, $A_2$ and $A_3$
be three independently and uniformly chosen binary sequences of length n. For $i \ne j$ and very large values of n one
knows that a longest common subsequence $\mathcal{L}_{i,j}$ of sequences $A_i$ and $A_j$ would be of length
approximately $\gamma_{2,2}\,n$. One would expect (although we cannot prove it) that $\mathcal{L}_{i,j}$ would behave
like a uniformly chosen binary string of length $\gamma_{2,2}\,n$. Sequences $\mathcal{L}_{1,2}$ and $\mathcal{L}_{2,3}$
are clearly correlated. However, one might guess that the correlation is weak (again, we can certainly neither
formalize nor prove such a statement). The preceding discussion suggests that a longest common subsequence
$\mathcal{L}_{1,2,3}$ of $\mathcal{L}_{1,2}$ and $\mathcal{L}_{2,3}$ should be of length approximately
$\gamma_{2,2}^2\,n$. Since $\mathcal{L}_{1,2,3}$ is clearly a common subsequence of $A_1$, $A_2$ and $A_3$, one is led to
conclude that

$$ \gamma_{2,3} \ge \gamma_{2,2}^2. \qquad (2) $$

However, there are two good reasons why one suspects that this last inequality should be strict:

• Since $\mathcal{L}_{2,3}$ has only a fraction of $A_3$'s length, one expects that a longest common subsequence of
$\mathcal{L}_{1,2}$ and $A_3$ is significantly larger than a longest common subsequence of $\mathcal{L}_{1,2}$ and
$\mathcal{L}_{2,3}$.

• The longest common subsequence of $A_1$, $A_2$ and $A_3$ might arise by taking a longest common subsequence of
sub-optimal common subsequences $\mathcal{L}'_{1,2}$ and $\mathcal{L}'_{2,3}$ of $A_1$ and $A_2$, and $A_2$ and $A_3$,
respectively.

This work's main contribution is to show that the inequality in (2) is indeed strict. 

In Section 2 we give a simple argument that proves that when $\sigma$ is fixed and d is large the identity
$\gamma_{\sigma,d} = \gamma_{\sigma,2}^{d-1}$ does not hold. The underlying argument is essentially an application of
the probabilistic method. However, it might still be possible that the relation would hold for some specific values of
$\sigma$ and d. Of particular interest is the case of binary sequences, i.e. $\sigma = 2$. In Section 3 we show that
even this weaker identity does not hold, i.e. that $\gamma_{2,3} \ne \gamma_{2,2}^2$. To achieve this goal, we rely on
Lueker's [Lue03] U = 0.826280 upper bound on $\gamma_{2,2}$ and determine a lower bound on $\gamma_{2,3}$ which is
strictly larger than $U^2 \ge \gamma_{2,2}^2$. The lower bound on $\gamma_{2,3}$ is obtained by an approach similar to
the one used by Lueker [Lue03] to lower bound $\gamma_{2,2}$, although in our case we have to consider a non-binary
alphabet. Aside from the extra notation needed to handle the cases $\sigma, d > 2$, our treatment is a straightforward
generalization of the approach used by Lueker. (In fact, in order to keep the exposition as clear as possible we do
not even use the optimization tweaks implemented by Lueker in order to take advantage of the symmetries inherent to
the problem and objects that arise in its analysis.) We conclude with some final comments in Section 4.

2 Disproving $\gamma_{\sigma,d} = \gamma_{\sigma,2}^{d-1}$ for large d

We start this section by introducing some notation. Given strings $A_1, \dots, A_d$ of length n, we denote by
$L(A_1, \dots, A_d)$ the length of the longest common subsequence of all $A_i$'s. Let $\mathcal{U}_{n,\sigma}$ be the
distribution of sequences of length n whose characters are chosen uniformly and independently from
$\Sigma = \{1, \dots, \sigma\}$. We denote by $L_{n,\sigma,d}$ the random variable $L(A_1, \dots, A_d)$ when all the
$A_i$ are chosen according to $\mathcal{U}_{n,\sigma}$. Finally, we let $\gamma_{\sigma,d}$ denote the limit of
$\mathbf{E}L_{n,\sigma,d}/n$ when $n \to \infty$ (the existence of this limit follows from standard subadditivity
arguments [CS75]).
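To make the quantity $L(A_1, \dots, A_d)$ concrete, the following is a minimal Python sketch of the textbook dynamic program for the LCS length of several strings. The function name `lcs_length` is ours and purely illustrative; it is not the implementation used for the experiments reported later, and its running time grows like the product of the string lengths, so it is only meant for small inputs.

```python
from functools import lru_cache


def lcs_length(*strings):
    """Length of a longest common subsequence of all the given strings."""

    @lru_cache(maxsize=None)
    def rec(pos):
        # pos[k] is the length of the prefix of strings[k] still under consideration.
        if min(pos) == 0:
            return 0
        last_chars = {s[p - 1] for s, p in zip(strings, pos)}
        if len(last_chars) == 1:
            # All prefixes end with the same character: match it.
            return 1 + rec(tuple(p - 1 for p in pos))
        # Otherwise at least one prefix does not use its last character.
        return max(
            rec(pos[:k] + (pos[k] - 1,) + pos[k + 1:]) for k in range(len(pos))
        )

    return rec(tuple(len(s) for s in strings))


if __name__ == "__main__":
    print(lcs_length("001", "011", "101"))
```

The recursion mirrors the usual two-string recurrence: when all current last characters agree, matching them is optimal; otherwise some string must drop its last character.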

In what follows, we give a lower bound for $\gamma_{\sigma,d}$ that is independent of d. This bound is based on the
following simple fact: If X is chosen according to $\mathcal{U}_{n,\sigma}$ and n is large, then the number of
occurrences in X of a fixed character of $\Sigma$ is roughly $n/\sigma$. Intuitively, this means that for a set of d
random strings of (very large) length n, with very high probability a sequence formed by roughly
$\lfloor n/\sigma \rfloor$ equal characters will be a common subsequence of all the d random strings.

Lemma 1 For all d and $\sigma$, we have $\gamma_{\sigma,d} \ge 1/\sigma$.

Proof: Let $A_1, \dots, A_d$ be d independent random strings chosen according to $\mathcal{U}_{n,\sigma}$. Let $X_i$
denote the number of times the character $c \in \Sigma$ appears in $A_i$, and $X = \min\{X_1, \dots, X_d\}$. The string
formed by X copies of the character c is a common subsequence of all $A_i$'s. It follows that
$L(A_1, \dots, A_d) \ge X$.

Each $X_i$ is a binomial variable with parameters n and $p = 1/\sigma$. By a standard Chernoff bound [JLR00, Remark 2.5]
we have that for any $0 < \varepsilon < 1$,

$$ \Pr[X_i \le (1-\varepsilon)np] \le \exp(-2n(p\varepsilon)^2). $$

Applying Markov's inequality, and recalling that the $X_i$'s are independent, it follows that:

$$ \mathbf{E}X \ge (1-\varepsilon)np \,\Pr[X \ge (1-\varepsilon)np] \ge (1-\varepsilon)np\,[1 - \exp(-2n(p\varepsilon)^2)]^d. $$

Letting n be sufficiently large so that $[1 - \exp(-2n(p\varepsilon)^2)]^d \ge (1-2\varepsilon)/(1-\varepsilon)$, we
obtain $\mathbf{E}X \ge np(1-2\varepsilon)$. Therefore:

$$ \frac{\mathbf{E}L_{n,\sigma,d}}{n} = \frac{\mathbf{E}L(A_1,\dots,A_d)}{n} \ge \frac{\mathbf{E}X}{n} \ge (1-2\varepsilon)p = \frac{1-2\varepsilon}{\sigma}. $$

It follows that $\gamma_{\sigma,d} \ge (1-2\varepsilon)/\sigma$. Since this is true for any $\varepsilon > 0$, we
conclude that $\gamma_{\sigma,d} \ge 1/\sigma$. ■
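The idea behind Lemma 1 can be illustrated numerically. The sketch below (our own illustration, not part of the proof) draws d uniform random strings, counts the occurrences of one fixed character in each, and reports the minimum count divided by n, which should be close to $1/\sigma$ for large n. The function name `min_count_ratio` and the choice of the character 0 are assumptions made only for this example.

```python
import random


def min_count_ratio(n, d, sigma, seed=0):
    """Return X / n, where X is the minimum number of occurrences of the
    fixed character 0 among d uniform random strings of length n over an
    alphabet {0, ..., sigma - 1}."""
    rng = random.Random(seed)
    counts = []
    for _ in range(d):
        # Only the count of the fixed character matters, so the string
        # itself is never stored.
        counts.append(sum(1 for _ in range(n) if rng.randrange(sigma) == 0))
    return min(counts) / n


if __name__ == "__main__":
    for sigma in (2, 3, 4):
        print(sigma, min_count_ratio(20000, 5, sigma), 1 / sigma)
```

A run of X copies of the fixed character is a common subsequence of all d strings, so the printed ratio is a (random) lower estimate of the normalized LCS length.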

It is now easy to disprove that $\gamma_{\sigma,d} = \gamma_{\sigma,2}^{d-1}$ for large d. Indeed, since
$\gamma_{\sigma,2} < 1$ [CS75], we have $\lim_{d\to\infty} \gamma_{\sigma,2}^{d-1} = 0$. On the other hand, the previous
lemma asserts that $\gamma_{\sigma,d} \ge 1/\sigma$ for all d, hence for d large enough,
$\gamma_{\sigma,2}^{d-1} < \gamma_{\sigma,d}$.

In particular, for the case $\sigma = 2$, Lueker [Lue03] proved that $\gamma_{2,2} \le U$ for U = 0.826280. Thus, for all
$d \ge 5$, we have the strict inequality

$$ \gamma_{2,2}^{d-1} \le (0.826280)^{d-1} < 1/2 \le \gamma_{2,d}. $$
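The threshold $d \ge 5$ can be checked with a few lines of arithmetic. The following sketch (the helper name `smallest_d` is ours) searches for the first d at which $U^{d-1}$ drops below $1/\sigma$, using Lueker's upper bound U for the binary case.

```python
def smallest_d(upper, sigma):
    """Smallest d >= 2 such that upper**(d - 1) < 1 / sigma."""
    d = 2
    while upper ** (d - 1) >= 1 / sigma:
        d += 1
    return d


if __name__ == "__main__":
    U = 0.826280  # Lueker's upper bound on gamma_{2,2}
    print(smallest_d(U, 2))
    print([round(U ** (d - 1), 6) for d in range(2, 7)])
```

For d = 3 and d = 4 the power $U^{d-1}$ is still above 1/2, which is why those two cases need the separate treatment of Section 3.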

3 Disproving $\gamma_{2,3} = \gamma_{2,2}^2$

3.1 Diagonal common subsequence 

As already mentioned, the best known provable lower bound for $\gamma_{2,2}$ found so far is due to Lueker [Lue03]. The
starting point of Lueker's lower bound technique is a result by Alexander [Ale94] who related the expected length of
the LCS of two random strings of the same length n, to the expected length of the LCS of two random strings whose
lengths sum up to 2n. Below, we establish an analog of Alexander's result but for the case of d randomly chosen
sequences.

Let C[j..k] denote the substring C[j]C[j+1]...C[k] formed by all the characters between the j-th and k-th
positions of C. Given strings $A_1, \dots, A_d$ of length at least n, we say that B is an n-diagonal common subsequence
of $A_1, \dots, A_d$ if B is a common subsequence of a set of prefixes of $A_1, \dots, A_d$ whose lengths sum to n, i.e.,
if for some indices $i_1, \dots, i_d$ such that $i_1 + \dots + i_d = n$, the string B is a common subsequence of
$A_1[1..i_1], \dots, A_d[1..i_d]$.

Let $D_n(A_1, \dots, A_d)$ denote the length of a longest n-diagonal common subsequence of the strings
$A_1, \dots, A_d$. We denote by $D_{n,\sigma,d}$ the random variable $D_n(A_1, \dots, A_d)$ where the strings
$A_1, \dots, A_d$ are chosen according to $\mathcal{U}_{n,\sigma}$.
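As an illustration of the definition, here is a brute-force Python sketch that computes the length of a longest n-diagonal common subsequence by enumerating all prefix-length tuples summing to n. The names `lcs_length` and `diagonal_lcs` are ours; the enumeration is exponential in d and is only intended for tiny examples.

```python
from functools import lru_cache
from itertools import product


def lcs_length(*strings):
    """Length of a longest common subsequence of all the given strings."""

    @lru_cache(maxsize=None)
    def rec(pos):
        if min(pos) == 0:
            return 0
        if len({s[p - 1] for s, p in zip(strings, pos)}) == 1:
            return 1 + rec(tuple(p - 1 for p in pos))
        return max(
            rec(pos[:k] + (pos[k] - 1,) + pos[k + 1:]) for k in range(len(pos))
        )

    return rec(tuple(len(s) for s in strings))


def diagonal_lcs(n, *strings):
    """Longest common subsequence over all choices of prefixes of the
    given strings whose lengths sum to n."""
    d = len(strings)
    best = 0
    for head in product(range(n + 1), repeat=d - 1):
        lengths = head + (n - sum(head),)
        # Skip tuples that are not valid prefix lengths.
        if lengths[-1] < 0 or any(i > len(s) for i, s in zip(lengths, strings)):
            continue
        prefixes = [s[:i] for s, i in zip(strings, lengths)]
        best = max(best, lcs_length(*prefixes))
    return best


if __name__ == "__main__":
    print(diagonal_lcs(4, "0101", "0101"))
```

Choosing all prefix lengths equal to n recovers the ordinary LCS of the length-n prefixes, which is the observation behind the comparison between $L$ and $D_{nd}$ made below.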

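The main objective of this section is to prove the following extension of a result of Alexander [Ale94,
Proposition 2.4] for the d = 2 case:

Theorem 1 For all $n \ge d$,

$$ d\,\mathbf{E}D_{n,\sigma,d} - d^{3/2}\sqrt{2n\ln n} \;\le\; \mathbf{E}L_{n,\sigma,d} \;\le\; \mathbf{E}D_{nd,\sigma,d}. $$

In particular, for all $\sigma$ there exists $\delta_{\sigma,d}$ such that:

$$ \delta_{\sigma,d} = \lim_{n\to\infty} \frac{\mathbf{E}D_{n,\sigma,d}}{n} = \frac{\gamma_{\sigma,d}}{d}. $$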

For the sake of clarity of exposition, before proving Theorem 1 we establish some intermediate results.

Lemma 2 For all n and d, $\mathbf{E}L_{n,\sigma,d} \le \mathbf{E}D_{nd,\sigma,d}$.

Proof: Let $A_1, \dots, A_d$ be random strings independently chosen according to $\mathcal{U}_{nd,\sigma}$. Since a
longest common subsequence of $A_1[1..n], \dots, A_d[1..n]$ is also an nd-diagonal common subsequence of
$A_1, \dots, A_d$,

$$ L(A_1[1..n], \dots, A_d[1..n]) \le D_{nd}(A_1, \dots, A_d). $$

Taking expectation on both sides of the previous inequality yields the desired conclusion. ■

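Lemma 3 For all $n \ge d$,

$$ d\,\mathbf{E}D_{n,\sigma,d} - d^{3/2}\sqrt{2n\ln n} \le \mathbf{E}L_{n,\sigma,d}. $$

Proof: Let $A_1, \dots, A_d$ be a list of words of length n. Note that if we change one character of any word in the
list, then the values $L(A_1, \dots, A_d)$ and $D_n(A_1, \dots, A_d)$ will change by at most one unit. It follows that
the random variables $L_{n,\sigma,d}$ and $D_{n,\sigma,d}$ (seen as functions from $(\Sigma^n)^d$ to $\mathbb{R}$) are
both 1-Lipschitz. Applying Azuma's inequality (as treated in for example [JLR00, § 2.4]) we get:

$$ \Pr\left[ D_{n,\sigma,d} \le \mathbf{E}D_{n,\sigma,d} - \sqrt{n/2} \right] \le \exp\left( -\frac{2(n/2)}{nd} \right) = e^{-1/d} < \frac{d}{d+1}, $$

where the last inequality holds since $e^{-x} < 1/(x+1)$ for all $x > 0$.

Let $\lambda = \mathbf{E}D_{n,\sigma,d} - \sqrt{n/2}$. Since $D_{n,\sigma,d} \ge \lambda$ implies that there are positive
indices $i_1, \dots, i_d$ such that $i_1 + \dots + i_d = n$ and $L(A_1[1..i_1], \dots, A_d[1..i_d]) \ge \lambda$,

$$ \Pr[D_{n,\sigma,d} \ge \lambda] \le \sum_{\substack{0 < i_1, \dots, i_d < n \\ i_1 + \dots + i_d = n}} \Pr[L(A_1[1..i_1], \dots, A_d[1..i_d]) \ge \lambda]. $$

Let I be the number of summands in the right hand side. Note that $I = \binom{n-1}{d-1}$ since it counts the number of
ways of partitioning n into d positive summands. It follows that there exist positive $j_1, \dots, j_d$ summing to n
such that:

$$ \Pr[L(A_1[1..j_1], \dots, A_d[1..j_d]) \ge \lambda] > \frac{1}{I}\left(1 - \frac{d}{d+1}\right) = \frac{1}{I(d+1)}. $$

Note that the distribution of the random variable $L(A_1[1..j_1], \dots, A_d[1..j_d])$ is the same as the distribution
of $L(A_1[1..j_{\tau(1)}], \dots, A_d[1..j_{\tau(d)}])$ for any permutation $\tau : [d] \to [d]$. It is also easy to see
that the distributions of $L(A_1[a_1..b_1], \dots, A_d[a_d..b_d])$ and $L(A_1[a'_1..b'_1], \dots, A_d[a'_d..b'_d])$ are
the same when $b_m - a_m = b'_m - a'_m$ for all $1 \le m \le d$.

Now, let $\tau$ be the cyclic permutation (1 2 ... d) and for $0 \le m \le d-1$ let $\mathcal{E}_m$ denote the event

$$ L\left( A_1\Big[1 + \sum_{l=0}^{m-1} j_{\tau^l(1)} \,..\, \sum_{l=0}^{m} j_{\tau^l(1)}\Big], \dots, A_d\Big[1 + \sum_{l=0}^{m-1} j_{\tau^l(d)} \,..\, \sum_{l=0}^{m} j_{\tau^l(d)}\Big] \right) \ge \lambda. $$

In particular, $\mathcal{E}_0$ is the event $\{L(A_1[1..j_1], \dots, A_d[1..j_d]) \ge \lambda\}$ whose probability was
bounded earlier. Note that the events $\mathcal{E}_0, \dots, \mathcal{E}_{d-1}$ are equiprobable. Since each of the
$\mathcal{E}_m$'s depends on a different set of characters, they are independent. Moreover, if
$\mathcal{E}_0, \dots, \mathcal{E}_{d-1}$ simultaneously occur, then by concatenating the common subsequences of each
block of characters we get that $L(A_1, \dots, A_d) \ge d\lambda$. Hence,

$$ \left(\frac{1}{I(d+1)}\right)^{d} < \prod_{m=0}^{d-1} \Pr[\mathcal{E}_m] \le \Pr[L_{n,\sigma,d} \ge d\lambda]. \qquad (3) $$

Applying Azuma's inequality again, we have:

$$ \Pr\left[ L_{n,\sigma,d} \ge \mathbf{E}L_{n,\sigma,d} + \sqrt{\frac{nd^2\ln(I(d+1))}{2}} \right] \le \left(\frac{1}{I(d+1)}\right)^{d}. \qquad (4) $$

Combining (3) and (4) and recalling that $\lambda = \mathbf{E}D_{n,\sigma,d} - \sqrt{n/2}$ we obtain:

$$ \Pr\left[ L_{n,\sigma,d} \ge \mathbf{E}L_{n,\sigma,d} + \sqrt{\frac{nd^2\ln(I(d+1))}{2}} \right] < \Pr\left[ L_{n,\sigma,d} \ge d\,\mathbf{E}D_{n,\sigma,d} - d\sqrt{\frac{n}{2}} \right]. $$

Hence:

$$ \mathbf{E}L_{n,\sigma,d} + \sqrt{\frac{nd^2\ln(I(d+1))}{2}} > d\,\mathbf{E}D_{n,\sigma,d} - d\sqrt{\frac{n}{2}}. $$

Since $2 \le d \le n$, $(d+1)I = (d+1)\binom{n-1}{d-1} \le n^d$, and so:

$$ d\,\mathbf{E}D_{n,\sigma,d} \le \mathbf{E}L_{n,\sigma,d} + d\sqrt{\frac{n}{2}} + \sqrt{\frac{nd^2\ln(I(d+1))}{2}} \le \mathbf{E}L_{n,\sigma,d} + d^{3/2}\sqrt{2n\ln(n)}. $$

■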

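Proof of Theorem 1: Lemmas 2 and 3 already give the bounds on $\mathbf{E}L_{n,\sigma,d}$.

To complete the proof we need to show that $\lim_{n\to\infty} \mathbf{E}D_{n,\sigma,d}/n$ exists and that its value is
$\gamma_{\sigma,d}/d$. By Lemmas 2 and 3 we have:

$$ \mathbf{E}L_{n,\sigma,d} \le \mathbf{E}D_{nd,\sigma,d} \le \frac{1}{d}\,\mathbf{E}L_{nd,\sigma,d} + d^{1/2}\sqrt{2nd\ln(nd)}. $$

Dividing by n, it follows that $\lim_{n\to\infty} \mathbf{E}D_{nd,\sigma,d}/n = \gamma_{\sigma,d}$. Furthermore,
$\mathbf{E}D_{n,\sigma,d}$ is non-decreasing in n, so:

$$ \frac{\lfloor n/d \rfloor}{n/d} \cdot \frac{\mathbf{E}D_{d\lfloor n/d \rfloor,\sigma,d}}{\lfloor n/d \rfloor} \;\le\; \frac{\mathbf{E}D_{n,\sigma,d}}{n/d} \;\le\; \frac{\lceil n/d \rceil}{n/d} \cdot \frac{\mathbf{E}D_{d\lceil n/d \rceil,\sigma,d}}{\lceil n/d \rceil}. $$

Since both the left hand side and right hand side terms above converge to $\gamma_{\sigma,d}$ when $n \to \infty$, the
middle term also converges to that value, and so $\lim_{n\to\infty} \mathbf{E}D_{n,\sigma,d}/n = \gamma_{\sigma,d}/d$ as
claimed. ■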



3.2 Longest common subsequence of two words over a binary alphabet 

In this section we describe Lueker's [Lue03] approach for finding a lower bound on $\gamma_{\sigma,d}$ when
$d = \sigma = 2$. Later on, we will generalize Lueker's technique to the cases of arbitrary d and $\sigma$.

Let $X_1$ and $X_2$ be two random sequences chosen from $\mathcal{U}_{n,2}$, i.e. strings of length n such that all
their characters are chosen uniformly and independently from the binary alphabet {0,1}. Lueker defines, for any two
strings A and B over the binary alphabet, the quantity

$$ W_n(A,B) = \mathbf{E}\left[ \max_{i+j=n} L(AX_1[1..i], BX_2[1..j]) \right]. $$

Informally, $W_n(A,B)$ represents the expected length of an LCS of two strings with prefixes A and B respectively and
suffixes formed by uniformly and independently choosing n characters in {0,1}. It is easy to see that $W_n(A,B)$
behaves as $D_{n,2,2}$ as $n \to \infty$. Hence, applying Alexander's d = 2 version of Theorem 1, Lueker observes that
for all $A, B \in \{0,1\}^*$,

$$ \gamma_{2,2} = \lim_{n\to\infty} \frac{W_{2n}(A,B)}{n}. $$

A natural idea is to approximate $\gamma_{2,2}$ by $W_{2n}(A,B)/n$. Fix the length $l \in \mathbb{N}$ of the strings A
and B and denote by $w_n$ the $2^{2l}$ dimensional vector whose coordinates correspond to the values $W_n(A,B)$ when A
and B vary over all binary sequences of length l. For example, when l = 2 the vector $w_n$ has the following form:

$$ w_n = \begin{pmatrix} w_n[00,00] \\ w_n[00,01] \\ \vdots \\ w_n[11,10] \\ w_n[11,11] \end{pmatrix} = \begin{pmatrix} W_n(00,00) \\ W_n(00,01) \\ \vdots \\ W_n(11,10) \\ W_n(11,11) \end{pmatrix}. $$

Lueker established a lower bound for each component of $w_n$ as a function of the components of $w_{n-1}$ and
$w_{n-2}$. To reproduce that lower bound, we need to introduce some more notation. If $A = A[1]A[2] \dots A[l]$ is a
sequence of length $l \ge 2$, let h(A) denote the head of A, i.e. its first character, and let T(A) denote its tail,
i.e. the substring obtained from A by removing its head. In other words, h(A) = A[1] and T(A) = A[2..l]. It is easy to
see that the following relations among $w_n$, $w_{n-1}$ and $w_{n-2}$ hold:



If h(A) = h(B), then

$$ w_n[A,B] \ge 1 + \frac{1}{4} \sum_{(c,c') \in \{0,1\}^2} w_{n-2}[T(A)c, T(B)c']. $$

If h(A) ≠ h(B), then

$$ w_n[A,B] \ge \frac{1}{2} \max\left\{ \sum_{c \in \{0,1\}} w_{n-1}[T(A)c, B],\; \sum_{c \in \{0,1\}} w_{n-1}[A, T(B)c] \right\}. $$

Using the previous inequalities one can define a function
$F : \mathbb{R}^{2^{2l}} \times \mathbb{R}^{2^{2l}} \to \mathbb{R}^{2^{2l}}$ such that for all $n \ge 2$, we have
$w_n \ge F(w_{n-1}, w_{n-2})$. Furthermore, the function F can be decomposed in two simpler functions $F_=$ and
$F_{\ne}$ such that if $\Pi_=$ and $\Pi_{\ne}$ are the projections of the vectors onto the coordinates corresponding to
the pairs of words with the same and different heads respectively, then:

$$ \Pi_=(w_n) \ge F_=(w_{n-2}), \quad\text{and}\quad \Pi_{\ne}(w_n) \ge F_{\ne}(w_{n-1}). $$

It might be useful to see some examples of these transformations. For instance, to obtain a lower bound of
$w_n[001,011]$, one considers:

$$ w_n[001,011] \ge F_=(w_{n-2})[001,011] = 1 + \frac{1}{4}\left( w_{n-2}[010,110] + w_{n-2}[010,111] + w_{n-2}[011,110] + w_{n-2}[011,111] \right). $$

And to bound $w_n[001,111]$,

$$ w_n[001,111] \ge F_{\ne}(w_{n-1})[001,111] = \frac{1}{2}\max\{ w_{n-1}[010,111] + w_{n-1}[011,111],\; w_{n-1}[001,110] + w_{n-1}[001,111] \}. $$
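The two recurrences of this section can be written down directly. The sketch below is our own illustration (the name `lueker_step` and the dictionary-based representation of the vectors are assumptions, not Lueker's implementation): it takes the vectors playing the roles of $w_{n-1}$ and $w_{n-2}$, indexed by pairs of binary strings of length l, and returns the resulting lower bound vector for $w_n$.

```python
from itertools import product


def lueker_step(w1, w2, l):
    """Apply the d = 2, binary-alphabet recurrences once.

    w1 and w2 play the roles of w_{n-1} and w_{n-2}; both are dicts keyed
    by pairs (A, B) of binary strings of length l.
    """
    words = ["".join(p) for p in product("01", repeat=l)]
    out = {}
    for A, B in product(words, repeat=2):
        if A[0] == B[0]:
            # Same head: match it and borrow two fresh random characters.
            total = sum(w2[A[1:] + c, B[1:] + e] for c in "01" for e in "01")
            out[A, B] = 1 + total / 4
        else:
            # Different heads: drop the head of A or the head of B.
            drop_a = sum(w1[A[1:] + c, B] for c in "01") / 2
            drop_b = sum(w1[A, B[1:] + c] for c in "01") / 2
            out[A, B] = max(drop_a, drop_b)
    return out


if __name__ == "__main__":
    words = ["".join(p) for p in product("01", repeat=3)]
    zero = {(A, B): 0.0 for A, B in product(words, repeat=2)}
    step = lueker_step(zero, zero, 3)
    print(step["001", "011"], step["001", "111"])
```

With all-zero inputs the output is 1 exactly on the pairs sharing a head, which matches the role of the constant term in the first recurrence.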
3.3 Longest common subsequence of d words over general alphabets

In this section we extend Lueker's lower bound arguments as described in the previous section to the general case of d
strings whose characters are uniformly and independently chosen over an alphabet of size $\sigma$.

Let $X_1, \dots, X_d$ be a collection of d independent random strings chosen according to $\mathcal{U}_{n,\sigma}$ and
let $A_1, \dots, A_d$ be a collection of d finite sequences over the same alphabet. We now consider:

$$ W_n(A_1, \dots, A_d) = \mathbf{E}\left[ \max_{i_1 + \dots + i_d = n} L(A_1X_1[1..i_1], \dots, A_dX_d[1..i_d]) \right]. $$

This quantity represents the expected length of an LCS of d words with prefixes $A_1, \dots, A_d$ respectively and d
suffixes whose lengths sum up to n and whose characters are uniformly and independently chosen in
$\Sigma = \{1, \dots, \sigma\}$. Since $W_n(A_1, \dots, A_d)$ and $D_{n,\sigma,d}$ behave similarly as $n \to \infty$,
Theorem 1 implies that for all $A_1, \dots, A_d$,

$$ \gamma_{\sigma,d} = \lim_{n\to\infty} \frac{W_{nd}(A_1, \dots, A_d)}{n}. \qquad (5) $$

Just as in the d = 2 case, fix $l \in \mathbb{N}$ and denote by $w_n$ the $\sigma^{ld}$ dimensional vector whose
coordinates are all the values of $W_n(A_1, \dots, A_d)$ when $A_1, \dots, A_d$ vary over all sequences in $\Sigma^l$.
We again seek a lower bound for $w_n$ as a function of vectors $w_m$, with $m < n$.

It is easy to see that if all the strings $A_1, \dots, A_d$ start with the same character, then:

$$ w_n[A_1, \dots, A_d] \ge 1 + \frac{1}{\sigma^d} \sum_{c \in \Sigma^d} w_{n-d}[T(A_1)c(1), T(A_2)c(2), \dots, T(A_d)c(d)]. $$

Informally, the previous inequality asserts that if all the words start with the same character then the expected
length of the LCS of all of them, allowing n random extra characters, is at least 1 (the first character) plus the
average of the expected length of the LCS of the words obtained by eliminating the first character and "borrowing" d of
the n random characters.

If not all the words start with the same character, we can still find a lower bound, but to write it down we need to
introduce some additional notation. For any two sets X and Y we follow the standard convention of denoting by $Y^X$
the set of all mappings from X to Y. Also, for a d-tuple of strings $A = (A_1, \dots, A_d)$ and $z \in \Sigma$ we
denote by $N_z(A)$ the set of indices $j \in \{1, \dots, d\}$ such that $A_j$'s head is not equal to z, i.e. the set of
indices of the strings not starting with z. For a mapping $c : N_z(A) \to \Sigma$ we define $T_z(A,c)$ as the d-tuple
of strings obtained from A by replacing each string $A_i$ that does not start with z by the sequence obtained by
eliminating its first character and adding the character c(i) at its tail. Formally,
$T_z(A,c) = (A'_1, \dots, A'_d)$ where

$$ A'_i = \begin{cases} A_i, & \text{if } h(A_i) = z, \\ T(A_i)c(i), & \text{if } h(A_i) \ne z. \end{cases} $$

A crucial fact is that for a d-tuple of strings A, if its coordinates do not all start with the same character, then

$$ w_n[A] \ge \max_{z \in \Sigma} \frac{1}{\sigma^{|N_z(A)|}} \sum_{c \in \Sigma^{N_z(A)}} w_{n-|N_z(A)|}[T_z(A,c)]. $$

Informally, each term over which the maximum is taken corresponds to the expected length of the LCS of the strings
one would obtain by disregarding all first characters of sequences not starting with z, and concatenating to the tail
of these strings an element randomly chosen over the alphabet $\Sigma$.

For the sake of illustration, consider the following example of the derived inequalities when $\sigma = 2$ and d = 4:

$$ w_n[001,011,101,001] \ge \max\left\{ \frac{1}{2} \sum_{c \in \{0,1\}^{\{3\}}} w_{n-1}[001,011,01c(3),001],\;\; \frac{1}{2^3} \sum_{c \in \{0,1\}^{\{1,2,4\}}} w_{n-3}[01c(1),11c(2),101,01c(4)] \right\}. $$

In the previous example only the third string over which $w_n$ is evaluated does not start with 0. Hence, the first
term over which the maximum is taken is the average of the values of $w_{n-1}$ evaluated at the two possible 4-tuples
of strings obtained from A by removing the initial 1 from the third string and adding a 0 or 1 final character. On the
other hand, $w_n$ is evaluated at three strings that do not start with a 1. Hence, the second term over which the
maximum is taken is the average of the values of $w_{n-3}$ over all the 4-tuples of strings obtained from A by
removing all the initial 0's and adding a 0 or 1 final character to those same strings.
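The index set $N_z(A)$ and the shifted tuples $T_z(A,c)$ of the example can be enumerated mechanically. The helper below is an illustrative sketch (the name `shifted_tuples` and the 0-based indices are our own conventions): it returns the positions of the strings not starting with z together with every tuple $T_z(A,c)$.

```python
from itertools import product


def shifted_tuples(A, z, alphabet="01"):
    """Return (N, tuples): N lists the 0-based positions of the strings of
    A that do not start with z, and tuples lists T_z(A, c) for every
    assignment c of one new final character to each position in N."""
    N = [k for k, s in enumerate(A) if s[0] != z]
    tuples = []
    for c in product(alphabet, repeat=len(N)):
        B = list(A)
        for k, ck in zip(N, c):
            B[k] = A[k][1:] + ck  # drop the head, append the new character
        tuples.append(tuple(B))
    return N, tuples


if __name__ == "__main__":
    A = ("001", "011", "101", "001")
    for z in "01":
        N, tuples = shifted_tuples(A, z)
        print(z, N, len(tuples))
```

For z = 0 a single string is shifted (two tuples, averaged against $w_{n-1}$), while for z = 1 three strings are shifted (eight tuples, averaged against $w_{n-3}$), in agreement with the displayed inequality.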

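Expressing all the derived inequalities in vector form we have that there is a function
$F : (\mathbb{R}^{\sigma^{ld}})^d \to \mathbb{R}^{\sigma^{ld}}$ such that

$$ w_n \ge F(w_{n-1}, w_{n-2}, \dots, w_{n-d}). \qquad (6) $$

For the ensuing discussion it will be convenient to rewrite F in an alternative way. For each $z \in \Sigma$ we define
the linear transformation $F_z : (\mathbb{R}^{\sigma^{ld}})^d \to \mathbb{R}^{\sigma^{ld}}$ such that

$$ F_z(v_1, \dots, v_d)[A] = \begin{cases} \dfrac{1}{\sigma^{|N_z(A)|}} \displaystyle\sum_{c \in \Sigma^{N_z(A)}} v_{|N_z(A)|}[T_z(A,c)], & \text{if } |N_z(A)| \ne 0, \\ 0, & \text{if } |N_z(A)| = 0. \end{cases} \qquad (7) $$

Then, if we let $b \in \mathbb{R}^{\sigma^{ld}}$ be the vector with value 1 in the coordinates associated to d-tuples of
strings of length l starting all with the same character and 0 in the rest of the coordinates, F can be expressed as:

$$ F(v_1, \dots, v_d) = b + \max_{z \in \Sigma} F_z(v_1, \dots, v_d). \qquad (8) $$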



3.4 Finding a lower bound for $\gamma_{\sigma,d}$

In the preceding section we established that for any d-tuple of strings $A = (A_1, \dots, A_d)$, each of length l, we
have $\gamma_{\sigma,d} = \lim_{n\to\infty} w_{nd}[A]/n$. To lower bound this latter quantity one is tempted to try the
following approach: (1) For a fixed word length l, compute explicitly $w_0, \dots, w_{d-1}$, and, (2) Define a new
sequence of vectors $(v_n)_{n \in \mathbb{N}}$ as $v_i = w_i$ for $0 \le i \le d-1$, and then iteratively define
$v_n = F(v_{n-1}, v_{n-2}, \dots, v_{n-d})$, for all $n \ge d$. Since F is monotone and by (6), we have that
$v_n \le w_n$ for every $n \in \mathbb{N}$. It is natural to fix an arbitrary d-tuple of strings
$A = (A_1, \dots, A_d)$ and estimate a lower bound for $\gamma_{\sigma,d}$ by $v_{nd}[A]/n$ for large enough n.

Unfortunately, for the approach discussed in the previous paragraph to work one would need to determine for
which values of n the quantity $v_{nd}[A]/n$ is effectively a lower bound for $\gamma_{\sigma,d}$. Indeed,
$v_{nd}[A]/n$ does not even need to be increasing and $w_{nd}[A]/n$ equals $\gamma_{\sigma,d}$ only in the limit when
$n \to \infty$. We will pursue a different approach that relies on the next lemma which is a generalization of an
observation by Lueker [Lue03] for the $d = \sigma = 2$ case.

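Lemma 4 Let $\mathcal{F} : (\mathbb{R}^{\sigma^{ld}})^d \to \mathbb{R}^{\sigma^{ld}}$ be a transformation that
satisfies the following properties:

1. Monotonicity: If the inequality $(v_1, v_2, \dots, v_d) \le (w_1, w_2, \dots, w_d)$ holds component-wise, then the
inequality $\mathcal{F}(v_1, v_2, \dots, v_d) \le \mathcal{F}(w_1, w_2, \dots, w_d)$ also holds component-wise.

2. Translation invariance: Let $\mathbf{1}$ be the vector of ones in $\mathbb{R}^{\sigma^{ld}}$ and
$\vec{\mathbf{1}} = (\mathbf{1}, \dots, \mathbf{1})$ be the vector of ones in $(\mathbb{R}^{\sigma^{ld}})^d$. Then, for
any $r \in \mathbb{R}$ and for all $(v_1, v_2, \dots, v_d) \in (\mathbb{R}^{\sigma^{ld}})^d$,

$$ \mathcal{F}((v_1, v_2, \dots, v_d) + r\vec{\mathbf{1}}) = \mathcal{F}(v_1, \dots, v_d) + r\mathbf{1}. $$

3. Feasibility: There exists a feasible triplet for $\mathcal{F}$, i.e. a $(u, r, \varepsilon)$ with
$u \in \mathbb{R}^{\sigma^{ld}}$, $r \in \mathbb{R}$, and $0 \le \varepsilon \le r$ such that:

$$ \mathcal{F}(u + (d-1)r\mathbf{1}, \dots, u + 2r\mathbf{1}, u + r\mathbf{1}, u) \ge u + (dr - \varepsilon)\mathbf{1}. $$

Then, for any sequence $(v_n)_{n \in \mathbb{N}}$ of vectors in $\mathbb{R}^{\sigma^{ld}}$ such that
$v_n \ge \mathcal{F}(v_{n-1}, \dots, v_{n-d})$ for all $n \ge d$, there exists a vector $u_0$ in
$\mathbb{R}^{\sigma^{ld}}$ such that for all $n \ge 0$,

$$ v_n \ge u_0 + n(r - \varepsilon)\mathbf{1}. \qquad (9) $$

Proof: Let $\mathcal{F}$ be a transformation satisfying the hypothesis of the lemma and $(u, r, \varepsilon)$ a
feasible triplet for $\mathcal{F}$. Let $(v_n)_{n \in \mathbb{N}}$ be a sequence of vectors as in the lemma's statement
and let $\alpha \in \mathbb{R}$ be large enough so that for all $0 \le j \le d-1$,

$$ v_j + \alpha\mathbf{1} \ge u + j(r - \varepsilon)\mathbf{1}. $$

For example, set $\alpha$ to be the largest component of the vector
$\max_{0 \le j \le d-1}(u + j(r - \varepsilon)\mathbf{1} - v_j)$.

Note that $u_0 = u - \alpha\mathbf{1}$ satisfies (9) for all $n \le d-1$. We will prove by induction that this holds
for all $n \in \mathbb{N}$. Suppose that (9) holds up to n-1. Using the inductive hypothesis we have:

$$ \begin{aligned} (v_{n-1}, \dots, v_{n-d}) &\ge (u_0 + (n-1)(r-\varepsilon)\mathbf{1}, \dots, u_0 + (n-j)(r-\varepsilon)\mathbf{1}, \dots, u_0 + (n-d)(r-\varepsilon)\mathbf{1}) \\ &= (u + (d-1)r\mathbf{1}, \dots, u + (d-j)r\mathbf{1} + (j-1)\varepsilon\mathbf{1}, \dots, u + (d-1)\varepsilon\mathbf{1}) + ((n-d)(r-\varepsilon) - (d-1)\varepsilon - \alpha)\vec{\mathbf{1}} \\ &\ge (u + (d-1)r\mathbf{1}, \dots, u + (d-j)r\mathbf{1}, \dots, u) + ((n-d)(r-\varepsilon) - (d-1)\varepsilon - \alpha)\vec{\mathbf{1}}. \end{aligned} $$

Evaluating $\mathcal{F}$ at the terms on both sides of the previous inequality we get, by monotonicity and translation
invariance, that

$$ v_n \ge \mathcal{F}(v_{n-1}, \dots, v_{n-d}) \ge \mathcal{F}(u + (d-1)r\mathbf{1}, \dots, u + (d-j)r\mathbf{1}, \dots, u) + ((n-d)(r-\varepsilon) - (d-1)\varepsilon - \alpha)\mathbf{1}. $$

Since $(u, r, \varepsilon)$ is a feasible triplet, it follows that:

$$ v_n \ge u + (dr - \varepsilon)\mathbf{1} + ((n-d)(r-\varepsilon) - (d-1)\varepsilon - \alpha)\mathbf{1} = u - \alpha\mathbf{1} + n(r - \varepsilon)\mathbf{1} = u_0 + n(r - \varepsilon)\mathbf{1}. $$

This completes the proof. ■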

From F's definition it easily follows that F is monotone and invariant under translations. If we find a feasible
triplet $(u, r, \varepsilon)$ for F then, by Lemma 4, we can conclude that the sequence of vectors
$(w_n)_{n \in \mathbb{N}}$ satisfies $w_n \ge u_0 + n(r - \varepsilon)\mathbf{1}$ for all n. It follows from (5) that:

$$ \gamma_{\sigma,d} \ge d(r - \varepsilon). $$

The key point we are trying to make is that in order to establish a good lower bound for $\gamma_{\sigma,d}$ one only
needs to exhibit a good feasible triplet, namely one such that $(r - \varepsilon)$ is as large as possible.

Empirically, one observes that for any set of initial vectors $v_0, \dots, v_{d-1}$, if one sets
$v_{n+d} = F(v_{n+d-1}, \dots, v_n)$ for all $n \in \mathbb{N}$, then the sequence $(v_n)_{n \in \mathbb{N}}$ is such
that $v_n/n$ seems to converge to a vector with all its components taking the same value. In fact, one observes that
for large values of n the vectors $v_n$ and $v_{n+1}$ differ essentially by a constant (independent of n) times the
all ones vector. Roughly, there exists a real value r such that $v_{n+1} - v_n$ is approximately $r\mathbf{1}$ for all
large enough n. Since, by definition, $v_{n+d} = F(v_{n+d-1}, \dots, v_{n+1}, v_n)$, this implies that

$$ F(v_n + (d-1)r\mathbf{1}, v_n + (d-2)r\mathbf{1}, \dots, v_n + r\mathbf{1}, v_n) \approx v_n + dr\mathbf{1}. $$

It follows that one possible approach to find a feasible triplet is to consider an n large enough so that the
difference between $v_n$ and $v_{n-1}$ is essentially a constant times the all ones vector. Then, set $u = v_n$, and
define r as the maximum value such that $v_n - v_{n-1} \ge r\mathbf{1}$ and $\varepsilon$ as the minimum possible value
such that the triplet $(u, r, \varepsilon)$ is feasible for F. The following result validates the approach just
described.

Lemma 5 Let $\mathcal{F} : (\mathbb{R}^{\sigma^{ld}})^d \to \mathbb{R}^{\sigma^{ld}}$ be a monotone and translation
invariant transformation. Let $v_0, \dots, v_{d-1} \in \mathbb{R}^{\sigma^{ld}}$ and
$v_{n+d} = \mathcal{F}(v_{n+d-1}, \dots, v_{n+1}, v_n)$ for all $n \in \mathbb{N}$. If for some $r \in \mathbb{R}$,
$n_0 \ge 1$ and $\varepsilon > 0$ we have $\|v_{n+1} - v_n - r\mathbf{1}\|_\infty \le \varepsilon/(2d)$ for all
$n \in \{n_0, \dots, n_0 + d - 1\}$, then $(v_{n_0}, r, \varepsilon)$ is a feasible triplet for $\mathcal{F}$.

Proof: First, observe that the monotonicity and translation invariance property of $\mathcal{F}$ implies that

$$ \|\mathcal{F}(x_0, \dots, x_{d-1}) - \mathcal{F}(y_0, \dots, y_{d-1})\|_\infty \le \max_{i = 0, \dots, d-1} \|x_i - y_i\|_\infty. $$

Let $u = v_{n_0}$ and note that $\|v_{n_0+i} - (u + ir\mathbf{1})\|_\infty \le i\varepsilon/(2d) \le \varepsilon/2$ for
$0 \le i \le d$. Hence, by definition of $v_{n_0+d}$,

$$ \|v_{n_0+d} - \mathcal{F}(u + (d-1)r\mathbf{1}, u + (d-2)r\mathbf{1}, \dots, u + r\mathbf{1}, u)\|_\infty \le \varepsilon/2. $$

Since $\|v_{n_0+d} - (u + dr\mathbf{1})\|_\infty \le \varepsilon/2$ it follows that

$$ \|(u + dr\mathbf{1}) - \mathcal{F}(u + (d-1)r\mathbf{1}, u + (d-2)r\mathbf{1}, \dots, u + r\mathbf{1}, u)\|_\infty \le \varepsilon. $$

In other words, $(u, r, \varepsilon)$ is a feasible triplet for $\mathcal{F}$. ■

It is easy to check that F satisfies the hypothesis of Lemma 5. This justifies, together with the empirical
observation that $v_{n+1} - v_n$ is approximately $r\mathbf{1}$ for large values of n, the general approach described
in this section for finding a feasible triplet for F, and thus a lower bound for $\gamma_{\sigma,d}$. It is important
to stress here that there is no need to prove the convergence of $v_n/n$ to $r\mathbf{1}$ in order to establish the
lower bound $\gamma_{\sigma,d} \ge d(r - \varepsilon)$. We only need to find a feasible triplet $(u, r, \varepsilon)$ for
F. The characteristics of F, empirical observations and Lemma 5, efficiently lead to such feasible triplets.



3.5 Implementation and results. New bounds 

In this section we describe the procedure we implemented in order to find a feasible triplet $(u, r, \varepsilon)$ for
F and, as a corollary, a lower bound for $\gamma_{\sigma,d}$. The procedure is called FeasibleTriplet, it is
parameterized in terms of the number of sequences d and the alphabet $\Sigma$, and its pseudocode is given in
Algorithm 1.

Algorithm 1 Procedure for computing a feasible triplet for F

    procedure FeasibleTriplet_{d,Σ}(l, n)        ▷ l ∈ N parameter, n ∈ N iteration steps
        for i = 0, ..., d − 1 do
            v_i ← 0                              ▷ where 0 denotes the vector of zeros in R^{σ^{ld}}
        end for
        (u, r, ε) ← (v_0, 0, 0)
        for i = d, ..., n do
            v_i ← F(v_{i−1}, v_{i−2}, ..., v_{i−d})
            R ← max_{A ∈ (Σ^l)^d} (v_i − v_{i−1})[A]
            W ← v_i + dR·1 − F(v_i + (d−1)R·1, ..., v_i + R·1, v_i)
            E ← max{0, max_{A ∈ (Σ^l)^d} W[A]}
            if R − E > r − ε then
                (u, r, ε) ← (v_i, R, E)
            end if
        end for
        return (u, r, ε)
    end procedure

In order to implement F we rely on the characterization given by (7) and (8). Since the $F_z$'s are linear
transformations, they can be represented as matrices. This allows for fast evaluation of the $F_z$'s, but requires a
prohibitively large amount of main memory for all but small values of $\sigma$, l and d. In order to optimize memory
usage, we use the fact that by distinguishing (7) according to the cardinality of $N_z(A)$ where
$A \in (\Sigma^l)^d$, $F_z$ can be written as:

$$ F_z(v_1, \dots, v_d) = \frac{1}{\sigma} F_{z,1}(v_1) + \dots + \frac{1}{\sigma^d} F_{z,d}(v_d), $$

where

$$ F_{z,i}(v)[A] = \begin{cases} \displaystyle\sum_{c \in \Sigma^{N_z(A)}} v[T_z(A,c)], & \text{if } |N_z(A)| = i, \\ 0, & \text{otherwise.} \end{cases} $$

Note in particular that every $F_{z,i}$ can be represented as a 0-1 sparse matrix.

In our experiments we ran Algorithm 1 for different values of l and alphabet sizes $\sigma$. As one would expect, the
derived lower bounds improve as l grows. However, the memory resources required to perform the computation also
increase. Indeed, throughout the second loop of Algorithm 1 we need to store d vectors of dimension $\sigma^{ld}$.
Also, a simple analysis of the definition of the sparse matrix $F_{z,i}$ shows that it has
$\binom{d}{i}\sigma^{(l-1)d}(\sigma - 1)^i\sigma^i$ non-zero entries. It follows that a sparse matrix representation of
$F_z$ has roughly $\sigma^{ld}(\sigma - 1)^d$ non-zero entries. Hence, the necessary computations are feasible only for
small values of $\sigma$, l and d, unless additional features of the matrices involved are taken advantage of in order
to optimize memory usage.
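For very small parameters, the procedure can be sketched in a few dozen lines of Python. The code below is an illustrative sketch under several assumptions of ours: vectors are plain dictionaries rather than sparse matrices, the alphabet is {0, ..., σ−1}, the names `make_F` and `feasible_triplet` are hypothetical, and floating-point arithmetic is used, so the output is not a rigorous bound. Letters z with empty $N_z(A)$ are skipped when taking the maximum in (8); this agrees with (7) and (8) on the non-negative vectors that arise here.

```python
from itertools import product


def make_F(sigma, d, l):
    """Build F of (8) for d strings of length l over {0, ..., sigma-1} (sigma >= 2)."""
    alphabet = range(sigma)
    words = list(product(alphabet, repeat=l))
    tuples = list(product(words, repeat=d))
    plan = {}
    for A in tuples:
        options = []
        for z in alphabet:
            N = [k for k in range(d) if A[k][0] != z]
            if not N:
                continue  # F_z[A] = 0 in (7); another letter covers A
            targets = []
            for c in product(alphabet, repeat=len(N)):
                B = list(A)
                for k, ck in zip(N, c):
                    B[k] = A[k][1:] + (ck,)
                targets.append(tuple(B))
            options.append((len(N), targets))
        b = 1.0 if len({a[0] for a in A}) == 1 else 0.0
        plan[A] = (b, options)

    def F(*vs):
        # vs[m - 1] plays the role of v_m, the vector m steps back.
        return {
            A: b + max(sum(vs[m - 1][B] for B in ts) / len(ts) for m, ts in options)
            for A, (b, options) in plan.items()
        }

    return F, tuples


def feasible_triplet(sigma, d, l, n):
    """Follow Algorithm 1 and return (u, r, eps)."""
    F, tuples = make_F(sigma, d, l)
    zero = {A: 0.0 for A in tuples}
    hist = [zero] * d  # v_{i-d}, ..., v_{i-1}; never mutated
    u, r, eps = zero, 0.0, 0.0
    for _ in range(d, n + 1):
        prev = hist[-1]
        v = F(*reversed(hist))
        R = max(v[A] - prev[A] for A in tuples)
        shifted = [{A: v[A] + k * R for A in tuples} for k in range(d - 1, -1, -1)]
        Fv = F(*shifted)
        E = max(0.0, max(v[A] + d * R - Fv[A] for A in tuples))
        if R - E > r - eps:
            u, r, eps = v, R, E
        hist = hist[1:] + [v]
    return u, r, eps


if __name__ == "__main__":
    u, r, eps = feasible_triplet(2, 2, 1, 40)
    print(2 * (r - eps))  # candidate lower bound d * (r - eps)
```

With $\sigma = d = 2$ and l = 1 the iteration settles at increments close to 1/3, so the resulting candidate bound is close to 2/3; larger l gives better bounds at a cost that grows like $\sigma^{ld}$.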

Table 1 summarizes the lower bounds we obtain for $\gamma_{\sigma,2}$ and contrasts them with previously derived ones.
To the best of our knowledge, for the d = 2 case and alphabet sizes 3, 4, 5, and 6, this work provides the currently
best known lower bounds for $\gamma_{\sigma,2}$. It might be worth mentioning that, as can be seen in that table, the
bound of [Dan94, Dek79] is better than the bound of the more recent work of [BYNGS99] for alphabet size 6, and that
for bigger alphabet sizes, the bound of [Dan94, Dek79] is still better than ours.

The best known lower bound for $\gamma_{2,2}$ is still the one established by Lueker [Lue03]. Table 2 lists the
distinct choices of $\sigma$ and d for which we could execute Algorithm 1 and indicates the value of the parameter l
giving rise to the reported lower bound.



Lower bounds for $\gamma_{\sigma,2}$:

    σ     This work     Baeza et al. [BYNGS99]     Dancik-Deken [Dan94, Dek79]
    3     0.671697*     0.63376                    0.61538
    4     0.599248*     0.55282                    0.54545
    5     0.539129*     0.50952                    0.50615
    6     0.479452*     0.46695                    0.47169
    7     0.444577                                 0.44502*
    8     0.356545                                 0.42237*
    9     0.327935                                 0.40321*
    10    0.303490                                 0.38656*

Table 1: Best known lower bounds for $\gamma_{\sigma,2}$ (marked with *).



3.6 Disproving Steele's $\gamma_{2,3} = \gamma_{2,2}^{2}$ speculation

We showed in Section 2 that $\gamma_{2,d} > \gamma_{2,2}^{d-1}$ for all $d \ge 5$. We now establish that this is also the case when $d = 3$ and
$d = 4$. Recall that Lueker [Lue03] proved that $\gamma_{2,2} \le U$ for $U = 0.826280$. From Table 2 we see that for $d = 3$ and
$d = 4$, the indicated lower bound for $\gamma_{2,d}$ is strictly greater than $U^{d-1}$, and therefore is also strictly greater than $\gamma_{2,2}^{d-1}$.
This implies that $\gamma_{2,d} > \gamma_{2,2}^{d-1}$ for $d = 3$ and $d = 4$, as claimed. Together with the results of Section 2 this establishes
that $\gamma_{2,d} > \gamma_{2,2}^{d-1}$ for all $d \ge 3$.
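
The comparison above is a short arithmetic check. The sketch below verifies that each lower bound exceeds $U^{d-1}$, using Lueker's upper bound $U = 0.826280$ and the Table 2 values for $d = 3$ and $d = 4$ typed in by hand. The helper and constant names are purely illustrative.

```python
U = 0.826280  # Lueker's upper bound on gamma_{2,2} [Lue03]

# Lower bounds for gamma_{2,d} reported in Table 2 (binary alphabet).
TABLE2_BINARY = {3: 0.704473, 4: 0.661274}


def exceeds_power_of_upper_bound(d: int, lower_bound: float, upper: float = U) -> bool:
    """True if lower_bound > upper**(d-1), which in turn is >= gamma_{2,2}**(d-1)."""
    return lower_bound > upper ** (d - 1)


if __name__ == "__main__":
    for d, bound in TABLE2_BINARY.items():
        print(d, bound, round(U ** (d - 1), 6), exceeds_power_of_upper_bound(d, bound))
```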

4 Final comments 

As already mentioned at the start of this paper, Steele [Ste86] pointed out that it would be of interest to find relations
between the values of the $\gamma_{\sigma,d}$'s, especially between $\gamma_{2,2}$ and $\gamma_{2,3}$. We think it would be very interesting if such a
relation existed. In fact, it might shed some light on the longstanding open problem of determining the exact
value of the Chvatal-Sankoff constant.

Alphabet size $\sigma = 2$
 d    L such that $\gamma_{2,d} > L$    Parameter $l$
 2    0.781281                          10
 3    0.704473                          7
 4    0.661274                          5
 5    0.636022                          4
 6    0.617761                          3
 7    0.602493                          2
 8    0.594016                          2
 9    0.587900                          2
10    0.570155
11    0.570155
12    0.563566
13    0.563566
14    0.558494

Alphabet size $\sigma = 3$
 d    L such that $\gamma_{3,d} > L$    Parameter $l$
 2    0.671697                          6
 3    0.556649                          4
 4    0.498525                          3
 5    0.461402                          2
 6    0.421436                          1
 7    0.413611                          1
 8    0.405539                          1

Alphabet size $\sigma = 4$
 d    L such that $\gamma_{4,d} > L$    Parameter $l$
 2    0.599248                          5
 3    0.457311                          3
 4    0.389008                          2
 5    0.335517                          1
 6    0.324014                          1

Alphabet size $\sigma = 5$
 d    L such that $\gamma_{5,d} > L$    Parameter $l$
 2    0.539129                          4
 3    0.356717                          2
 4    0.289398                          1
 5    0.273884                          1

Alphabet size $\sigma = 6$
 d    L such that $\gamma_{6,d} > L$    Parameter $l$
 2    0.479452                          3
 3    0.309424                          2
 4    0.245283                          1

Alphabet size $\sigma = 7$
 d    L such that $\gamma_{7,d} > L$    Parameter $l$
 2    0.444577                          3
 3    0.234567                          1
 4    0.212786                          1

Alphabet size $\sigma = 8$
 d    L such that $\gamma_{8,d} > L$    Parameter $l$
 2    0.356545                          2
 3    0.207547                          1

Alphabet size $\sigma = 9$
 d    L such that $\gamma_{9,d} > L$    Parameter $l$
 2    0.327935                          2
 3    0.186104                          1

Alphabet size $\sigma = 10$
 d    L such that $\gamma_{10,d} > L$   Parameter $l$
 2    0.303490                          2
 3    0.168674                          1

Table 2: Lower bounds for $\gamma_{\sigma,d}$.



Lacking a relation among the $\gamma_{\sigma,d}$'s it would still be interesting to relate these terms to some other constants
that arise in connection with other combinatorial problems. A step in this direction was taken by Kiwi, Loebl and
Matousek [KLM05] who showed that $\sqrt{\sigma}\,\gamma_{\sigma,2} \to c_2$ when $\sigma \to \infty$, where $c_2$ is a constant that turns up in the study of
the Longest Increasing Sequence (LIS) problem (also known as Ulam's problem). Specifically, $c_2$ is the limit to which
the expected length of a LIS of a randomly chosen permutation of $\{1, \ldots, n\}$ converges when normalized by $\sqrt{n}$.
Logan and Shepp [LS77] and Vershik and Kerov [VK77] showed that $c_2 = 2$. Consider now the following experiment:
Choose $n$ points in a unit $d$-dimensional cube $[0, 1]^d$ and let $H_d(n)$ be the random variable corresponding to the length
of a longest chain (for the standard partial order in $\mathbb{R}^d$) of the $n$ chosen points. Bollobas and Winkler [BW88] proved
that there are constants $c'_1, c'_2, \ldots$ such that $c'_d < e$, $\lim_{d\to\infty} c'_d = e$ and $\lim_{n\to\infty} H_d(n)/n^{1/d} = c'_d$. By labeling
a set $S$ of points in $[0, 1]^2$ in increasing order of their $x$ coordinate and reading the labels in the order of their $y$
coordinates one can associate a permutation $\pi$ to the set $S$. It is easy to see that the chains of points in $S$ are in one to
one correspondence with the increasing sequences of $\pi$. Hence, it follows that $c'_2 = c_2$. Soto [Sot06] extended the results
of [KLM05] and showed that $\sigma^{1-1/d}\gamma_{\sigma,d} \to c'_d$ when $\sigma \to \infty$. We think that any similar type of result, or even a
reasonable conjecture, that would hold for fixed $\sigma$ and $d$ would also be quite interesting.
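
The correspondence between chains in the unit square and increasing sequences of a permutation can be illustrated with a minimal sketch. It labels the points by increasing $x$ coordinate, reads the labels in order of increasing $y$ coordinate, and computes the LIS of the resulting permutation by patience sorting. The function names, the sample size and the random seed are illustrative choices, and the code assumes that all coordinates are distinct, which holds almost surely for uniformly chosen points.

```python
import bisect
import random


def lis_length(seq):
    """Length of a longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k]: smallest possible last element of an increasing subsequence of length k+1
    for x in seq:
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)


def longest_chain_2d(points):
    """Length of a longest chain of planar points, via the associated permutation.

    Assumes all x coordinates and all y coordinates are distinct.
    """
    # Label the points 0..n-1 in increasing order of their x coordinate.
    by_x = sorted(range(len(points)), key=lambda i: points[i][0])
    label = {i: rank for rank, i in enumerate(by_x)}
    # Read the labels in increasing order of the y coordinate.
    by_y = sorted(range(len(points)), key=lambda i: points[i][1])
    perm = [label[i] for i in by_y]
    return lis_length(perm)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    # Normalized chain length, to be compared with the limit c_2 = 2.
    print(longest_chain_2d(pts) / n ** 0.5)
```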